Unsupervised Speech Enhancement Using Dynamical Variational Autoencoders

Abstract

Dynamical variational autoencoders (DVAEs) are a class of deep generative models with latent variables, dedicated to modeling time series of high-dimensional data. DVAEs can be considered as extensions of the variational autoencoder (VAE) that include temporal dependencies between successive observed and/or latent vectors. Previous work has shown the benefit of using DVAEs over the VAE for modeling speech spectrograms. Independently, the VAE has been successfully applied to speech enhancement in noise, in an unsupervised noise-agnostic set-up that requires neither noise samples nor noisy speech samples at training time, but only clean speech signals. In this paper, we extend these works to DVAE-based single-channel unsupervised speech enhancement, hence exploiting both unsupervised representation learning and dynamics modeling of speech signals. We propose an unsupervised speech enhancement algorithm that combines a DVAE speech prior pre-trained on clean speech signals with a noise model based on nonnegative matrix factorization, and we derive a variational expectation-maximization (VEM) algorithm to perform speech enhancement. The algorithm is presented with the most general DVAE formulation and is then applied with three specific DVAE models to illustrate the versatility of the framework. Experimental results show that the proposed DVAE-based approach outperforms its VAE-based counterpart, as well as several supervised and unsupervised noise-dependent baselines, especially when the noise type is unseen during training.
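As a rough illustration of how a speech variance prior can be combined with a nonnegative matrix factorization (NMF) noise model, the sketch below fits the noise factors to a noisy power spectrogram and then builds a Wiener-like gain. It is not the paper's VEM algorithm: the speech variance `speech_var` is assumed to be given (in the paper it would come from the pre-trained DVAE decoder and be refined during inference), the noise factors are fitted with standard Itakura-Saito multiplicative updates, and the function names `fit_nmf_noise` and `wiener_gain` are hypothetical.

```python
import numpy as np

def fit_nmf_noise(noisy_power, speech_var, rank=4, n_iter=50, seed=0, eps=1e-10):
    """Fit NMF noise factors W (F x rank) and H (rank x T).

    Assumes the noisy power spectrogram is modeled with variance
    speech_var + W @ H, where speech_var is held fixed (a simplification).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = noisy_power.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Itakura-Saito multiplicative update for W
        V = speech_var + W @ H + eps
        W *= ((noisy_power / V**2) @ H.T) / ((1.0 / V) @ H.T + eps)
        # Itakura-Saito multiplicative update for H
        V = speech_var + W @ H + eps
        H *= (W.T @ (noisy_power / V**2)) / (W.T @ (1.0 / V) + eps)
    return W, H

def wiener_gain(speech_var, noise_var, eps=1e-10):
    """Per-bin gain in [0, 1] applied to the noisy STFT to estimate speech."""
    return speech_var / (speech_var + noise_var + eps)

# Toy usage on synthetic data (placeholder values, not real speech).
rng = np.random.default_rng(0)
speech_var = rng.random((16, 20)) + 0.1    # stand-in for decoder output
noisy_power = speech_var + rng.random((16, 20))
W, H = fit_nmf_noise(noisy_power, speech_var, rank=2, n_iter=30)
gain = wiener_gain(speech_var, W @ H)
```

In a full system the gain would multiply the complex noisy STFT before inverting it back to a waveform; the alternation between updating the latent speech posterior and the noise factors is what the VEM algorithm in the paper handles.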

Similar resources

Modeling and Transforming Speech Using Variational Autoencoders

Latent generative models can learn higher-level underlying factors from complex data in an unsupervised manner. Such models can be used in a wide range of speech processing applications, including synthesis, transformation and classification. While there have been many advances in this field in recent years, the application of the resulting models to speech processing tasks is generally not exp...

Image Transformation Using Variational Autoencoders

The way data are stored in a computer is definitively not the most intelligible approach that one can think about even though it makes computation and communication very convenient. This issue is essentially equivalent to dimensionality reduction problem under the assumption that the data can be embedded into a low-dimensional smooth manifold (Olah [2014]). We have seen couple of examples in th...

Deep Unsupervised Clustering with Gaussian Mixture Variational Autoencoders

We study a variant of the variational autoencoder model (VAE) with a Gaussian mixture as a prior distribution, with the goal of performing unsupervised clustering through deep generative models. We observe that the known problem of over-regularisation that has been shown to arise in regular VAEs also manifests itself in our model and leads to cluster degeneracy. We show that a heuristic called ...

Automatic Chemical Design using Variational Autoencoders

We train a variational autoencoder to convert discrete representations of molecules to and from a multidimensional continuous representation. This continuous representation allows us to automatically generate novel chemical structures by performing simple operations in the latent space, such as decoding random vectors, perturbing known chemical structures, or interpolating between molecules. Con...

Deep Unsupervised Clustering Using Mixture of Autoencoders

Unsupervised clustering is one of the most fundamental challenges in machine learning. A popular hypothesis is that data are generated from a union of low-dimensional nonlinear manifolds; thus an approach to clustering is identifying and separating these manifolds. In this paper, we present a novel approach to solve this problem by using a mixture of autoencoders. Our model consists of two part...

Journal

Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Year: 2022

ISSN: 2329-9304, 2329-9290

DOI: https://doi.org/10.1109/taslp.2022.3207349